ABSTRACT: This paper describes doctoral research that focuses on the development of a learning
analytics framework for inquiry-based digital learning. Building on the Community of Inquiry
(CoI) model, a foundation commonly used in the research and practice of digital learning and
teaching, this research extends the existing body of knowledge in two important ways. First,
given that the CoI model requires substantial manual coding of student discourse, its potential
for guiding pedagogical interventions is limited. Thus, the first contribution is the development
of a learning analytics system that automates this coding process by means of a novel
text-classification algorithm that takes into account the process nature of inquiry-based learning
and the specifics of communication through asynchronous discussions. Furthermore, it is equally
important to investigate how learning processes unfold over time through student interactions 
with information, technology, and other course participants. With this in mind, the second 
contribution of this research focuses on the development of analytical models that provide 
insight into these important aspects of inquiry-based learning. 

KEYWORDS: Learning analytics, community of inquiry, quantitative content analysis, social 

1. INTRODUCTION

In distance education, asynchronous discussion boards are the primary means of social interaction. Not
only do they facilitate help seeking and information sharing, but they also promote the development of
learner communities and critical thinking skills (Anderson & Dron, 2010). Over the years, many different
models of online learning focusing on the use of asynchronous online discussions have been developed,
with the Community of Inquiry (CoI) model (Garrison, Anderson, & Archer, 1999) being one of the most
researched and adopted. The model posits that the educational experience of members of a
community of inquiry is shaped by three dimensions: 1) teaching presence, 2) social presence, and 3)
cognitive presence. Cognitive presence is the central construct in the CoI model, operationalized
through four phases ranging from the initial problem conception to its resolution (Garrison et al., 1999).
To support the assessment of these three dimensions, instruments for quantitative content analysis and
self-reported surveys have been developed and validated.

An important challenge of the CoI model relates to its practical use and the scaling up of its research.
The CoI model is primarily adopted using coding schemes for each of its three presences (Garrison et al.,
1999). Given that quantitative content analysis requires significant time and manual coding work, the
CoI model has been used primarily for retrospection and not for real-time guidance of
instructional interventions. Likewise, sample sizes in existing studies are comparatively small, which
hampers the use of the model at large scale. Existing systems for automated coding of discussion
messages developed for the CoI dimensions (Corich, Hunt, & Hunt, 2012; McKlin, Harmon, Evans, & Jones,
2002) do not provide sufficient levels of accuracy to enable their practical application. Thus, one of
the goals of this doctoral research is to develop a learning analytics system that enables more efficient
message coding by means of a novel text-classification algorithm that takes into account the specifics of
inquiry-based learning and quantitative content analysis. Furthermore, the development of such a system
entails the identification of important surface-level features of online communication, which can yield
a more detailed operationalization of the different indicators of the three dimensions of the CoI model.


It is equally important to understand how the learning process in a community of inquiry unfolds
through the interaction of learners with other learners, technology, and information (i.e., content).
With this in mind, this doctoral research focuses on two areas. First, we look at the social
interactions of learners using social network analysis (SNA) to examine the role of the CoI dimensions in
the development of student social relationships and social capital. Second, we look at the interaction of
learners with technology and content in order to understand the role of student agency and educational
technology use in the outcomes of student learning processes. At present, only the study by Shea et al.
(2010) has examined the use of social network measures in the context of CoI research. Similarly,
only a recent study by Rubin, Fernandes, and Avgerinou (2013) has looked at the relationship
between students' perceived value of technological affordances and the constructs of the CoI model.
With the goal of providing insight into online learning phenomena, our research focuses on
examining relationships between the above-mentioned constructs and the three CoI presences.


2. METHODS 


This research relies primarily on quantitative methods and the investigation of
empirical data from real-world fully online, blended, and massive open online courses. After the
literature review, the doctoral research includes collecting trace data and online discussion transcripts
from several courses and manually coding the messages in accordance with the cognitive-presence
coding scheme. These data are then used for the development of a text analytics system for
(semi)automated message coding. Likewise, social network analysis is used to investigate the
development of students' social networks and social capital, while clustering based on trace data
provides insight into student differences in terms of agency and educational technology use.

3. RESULTS 


In the first study, we examined the use of an SVM classifier and different classification features as
described by Kovanovic, Joksimovic, Gasevic, and Hatala (2014a) for automating the coding of cognitive
presence. While our results indicate several classes of features useful for this classification task, the
results also indicate the need for a classification technique that better supports the iterative, process
nature of inquiry-based learning and the rules for using quantitative content analysis instruments. For
example, when a message shows indicators of several phases of cognitive presence, the rules of the
cognitive presence coding scheme state that the message should be coded to the highest exhibited phase.
This is a challenge for typical text classification algorithms that use surface-based features (e.g.,
N-grams), as the classification is accomplished without regard for the position of a feature within the
message.
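
The two points above can be illustrated with a minimal Python sketch. It is not the classifier used in the study; it only shows the "highest exhibited phase" coding rule and why a bag of N-grams cannot see where in a message a feature occurs. The phase names follow the commonly cited CoI terminology and are not listed in this paper; the "other" label, the function names, and the sample messages are assumptions made for illustration.

```python
from collections import Counter

# Ordered phases of cognitive presence (names assumed from common CoI usage;
# only their order matters for the coding rule below).
PHASES = ["triggering event", "exploration", "integration", "resolution"]


def code_message(indicated_phases):
    """Apply the coding rule: a message is coded to the highest phase it exhibits."""
    if not indicated_phases:
        return "other"  # hypothetical label for messages with no indicators
    return max(indicated_phases, key=PHASES.index)


def ngram_counts(text, n=1):
    """Bag of word N-grams: keeps counts only and discards position."""
    tokens = text.lower().split()
    return Counter(zip(*(tokens[i:] for i in range(n))))


# Two made-up messages with the same words in a different order.
early = "i think this solves it but why does it fail at first"
late = "why does it fail at first but i think this solves it"

print(ngram_counts(early) == ngram_counts(late))     # True
print(code_message(["exploration", "resolution"]))   # resolution
```

With unigram counts, both messages map to the same feature vector, so a classifier built on these features alone cannot tell whether the resolution-like statement opens or closes the message.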

The study by Kovanovic, Joksimovic, Gasevic, and Hatala (2014b) looked at the relationship between 
social presence and social network position and revealed the critical role of interactivity in student 
communication in the development of social capital. Likewise, the current research looking at student 
differences in educational technology use found six profiles associated with significant differences in the 
levels of cognitive presence, which are primarily formed around differences in 1) the focus on 
technology use (content vs. discussion), and 2) levels of active discussion participation. 
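
As a rough illustration of what "social network position" can mean in a discussion forum, the sketch below computes normalized in-degree and out-degree centrality from reply pairs. The paper does not state which network measures were used, so degree centrality, the function name, and the sample data are all assumptions chosen for simplicity.

```python
from collections import defaultdict


def degree_centrality(replies):
    """Compute normalized in/out degree from (author, replied_to) pairs.

    Repeated replies between the same pair count once, and self-replies
    are ignored. Degrees are divided by the number of other participants.
    """
    out_neighbors = defaultdict(set)
    in_neighbors = defaultdict(set)
    nodes = set()
    for author, target in replies:
        if author == target:
            continue
        nodes.update((author, target))
        out_neighbors[author].add(target)
        in_neighbors[target].add(author)
    denom = max(len(nodes) - 1, 1)
    return {
        node: {
            "in": len(in_neighbors[node]) / denom,
            "out": len(out_neighbors[node]) / denom,
        }
        for node in nodes
    }


# Hypothetical reply log: who replied to whom.
replies = [("ana", "ben"), ("cy", "ben"), ("ben", "ana")]
centrality = degree_centrality(replies)
print(centrality["ben"])  # {'in': 1.0, 'out': 0.5}
```

In this toy log, "ben" receives replies from every other participant, which is one simple way a central network position could be quantified before relating it to coded social presence.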


4. CONTRIBUTION TO LEARNING ANALYTICS


From a practical perspective, the current research makes the following contributions: 1) it allows
educational researchers and practitioners to adopt the CoI model more easily, without the need to
manually code messages, 2) it increases sample sizes due to faster discussion analysis procedures, 3) it
enables development of systems for real-time monitoring of learning activities, and 4) it provides
opportunities for data-driven guidance of instructional interventions, such as moderation of discussions
to better support learning outcomes. Likewise, through investigating the role of the three CoI presences
in different socio-technological interactions, this research can provide insight into learning within
communities of inquiry from a novel perspective, not currently explored in detail. Finally, this research
will enable the adoption of the CoI model in other learning contexts, such as MOOCs, where there
are important pedagogical differences not fully addressed by current research into the CoI model.